Using formant frequencies in speech recognition
Authors
Abstract
Formant frequencies have rarely been used as acoustic features for speech recognition, in spite of their phonetic significance. For some speech sounds, one or more of the formants may be so poorly defined that attempting a frequency measurement is not useful. It is also often difficult to decide which formant labels to attach to particular spectral peaks. This paper describes a new method of formant analysis that includes techniques to overcome both of these difficulties. Using the same data and HMM structure, results are compared between a recognizer using conventional cepstrum features and one using three formant frequencies combined with fewer cepstrum features to represent general spectral trends. For the same total number of features, the results show that including formant features can offer increased accuracy over using cepstrum features only.
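The comparison described in the abstract, where both recognizers use the same total number of features, can be illustrated with a minimal sketch. It is not the paper's method: the formant values are assumed to come from some external formant tracker, a plain real cepstrum stands in for whatever cepstrum front end the authors used, and the function names, the total of 8 features, and the example formant values are all hypothetical.

```python
import numpy as np

def real_cepstrum(frame, n_coeffs):
    """Return the first n_coeffs real-cepstrum coefficients of one frame.

    Assumption: a plain real cepstrum is used as a stand-in for the
    cepstrum features mentioned in the abstract.
    """
    windowed = frame * np.hamming(len(frame))
    magnitude = np.abs(np.fft.rfft(windowed))
    log_magnitude = np.log(magnitude + 1e-10)  # small floor avoids log(0)
    cepstrum = np.fft.irfft(log_magnitude)
    return cepstrum[:n_coeffs]

def cepstrum_only_features(frame, total_features=8):
    """Baseline feature vector: cepstrum coefficients only."""
    return real_cepstrum(frame, total_features)

def combined_features(frame, formants_hz, total_features=8):
    """Formant frequencies plus fewer cepstrum coefficients.

    The number of cepstrum coefficients is reduced so that the vector
    length stays equal to total_features (hypothetical default of 8).
    formants_hz is assumed to be supplied by an external formant tracker.
    """
    formants_hz = np.asarray(formants_hz, dtype=float)
    n_cep = total_features - len(formants_hz)
    if n_cep < 0:
        raise ValueError("more formants than total features")
    return np.concatenate([real_cepstrum(frame, n_cep), formants_hz])

# Usage with a synthetic 25 ms frame at 16 kHz and made-up formant values.
frame = np.sin(2 * np.pi * 200 * np.arange(400) / 16000)
baseline = cepstrum_only_features(frame)
combined = combined_features(frame, [500.0, 1500.0, 2500.0])
```

Both vectors have the same length, so a recognizer trained on either one sees the same feature dimensionality; only the content of the last three slots differs.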
Similar resources
Statistical Variation Analysis of Formant and Pitch Frequencies in Anger and Happiness Emotional Sentences in Farsi Language
Setting up an emotion recognition or emotional speech recognition system depends directly on how emotion changes the speech features. In this research, the influence of anger and happiness on speech was evaluated and the results were compared with neutral speech. For this purpose, the pitch frequency and the first three formant frequencies were used. The experimental results showed that there are lo...
Recognition of Emotional Speech and Speech Emotion in Farsi
Speech emotion can add extra information to speech in comparison with the available textual information. However, it can also lead to some problems in the automatic speech recognition process. In earlier research, we evaluated the changes in speech parameters, i.e. formant frequencies and pitch frequency, due to anger and grief for the Farsi language. Here, using those results, we try to improve emotio...
Modelling Speech Signals using Formant Frequencies as an Intermediate Representation
This paper concerns Multiple-level Segmental Hidden Markov Models (M-SHMMs) in which the relationship between symbolic and acoustic representations of speech is regulated by a formant-based intermediate representation. New TIMIT phone recognition results are presented, confirming that the theoretical upper-bound on performance is achieved provided that either the intermediate representation or t...
Exploring the Effect of Differences in the Acoustic Correlates of Adults' and Children's Speech in the Context of Automatic Speech Recognition
This work explores the effect of mismatches between adults’ and children’s speech due to differences in various acoustic correlates on the automatic speech recognition performance under mismatched conditions. The different correlates studied in this work include the pitch, the speaking rate, the glottal parameters (open quotient, return quotient, and speech quotient), and the formant frequencie...
Formant Analysis of Bangla Vowel for Automatic Speech Recognition
Regional and local language recognition now draws the attention of researchers as a way to bring new technological benefits to the general public. As with other languages, a Bangla speech recognition scheme is in demand. A formant is a resonance frequency of the vocal tract. Formant frequencies play an important role in automatic speech recognition, due to their noise...